220 research outputs found

GPT detectors are biased against non-native English writers

Author: Liang Weixin
Mao Yining
Wu Eric
Yuksekgonul Mert
Zou James
Publication date: 18/04/2023

The rapid adoption of generative language models has brought about substantial advancements in digital communication, while simultaneously raising concerns regarding the potential misuse of AI-generated content. Although numerous detection methods have been proposed to differentiate between AI and human-generated content, the fairness and robustness of these detectors remain underexplored. In this study, we evaluate the performance of several widely used GPT detectors using writing samples from native and non-native English writers. Our findings reveal that these detectors consistently misclassify non-native English writing samples as AI-generated, whereas native writing samples are accurately identified. Furthermore, we demonstrate that simple prompting strategies can not only mitigate this bias but also effectively bypass GPT detectors, suggesting that GPT detectors may unintentionally penalize writers with constrained linguistic expressions. Our results call for a broader conversation about the ethical implications of deploying ChatGPT content detectors and caution against their use in evaluative or educational settings, particularly when they may inadvertently penalize or exclude non-native English speakers from the global discourse.

arXiv.org e-Print Archive

Information Bottleneck Revisited: Posterior Probability Perspective with Optimal Transport

Author: Bai Bo
Chen Lingyi
Sun Yining
Wu Hao
Wu Huihui
Wu Shitong
Ye Wenhao
Zhang Wenyi
Publication date: 22/08/2023

Information bottleneck (IB) is a paradigm to extract information in one target random variable from another relevant random variable, which has aroused great interest due to its potential to explain deep neural networks in terms of information compression and prediction. Despite its great importance, finding the optimal bottleneck variable involves a difficult nonconvex optimization problem due to the nonconvexity of the mutual information constraint. The Blahut-Arimoto (BA) algorithm and its variants provide an approach by considering its Lagrangian with a fixed Lagrange multiplier. However, only the strictly concave IB curve can be fully obtained by the BA algorithm, which strongly limits its application in machine learning and related fields, as strict concavity cannot be guaranteed in those problems. To overcome the above difficulty, we derive an entropy regularized optimal transport (OT) model for the IB problem from a posterior probability perspective. Correspondingly, we use the alternating optimization procedure and generalize the Sinkhorn algorithm to solve the above OT model. The effectiveness and efficiency of our approach are demonstrated via numerical experiments. Comment: ISIT 202

arXiv.org e-Print Archive

A modified airfoil-based piezoaeroelastic energy harvester with double plunge degrees of freedom

Author: Da Ronch Andrea
Li Daochun
Wu Yining
Xiang Jinwu
Publication venue: 'Elsevier BV'

In this letter, a piezoaeroelastic energy harvester based on an airfoil with double plunge degrees of freedom is proposed to additionally take advantage of the vibrational energy of the airfoil pitch motion. An analytical model of the proposed energy harvesting system is built and compared with an equivalent model using the well-explored pitch-plunge configuration. The dynamic response and average power output of the harvester are numerically studied as the flow velocity exceeds the cut-in speed (flutter speed). It is found that the harvester with the double-plunge configuration generates 4%–10% more power across varying flow velocities while reducing the cut-in speed by 6% compared with its counterpart.

Southampton (e-Prints Soton)

Determination of Residual Oil Distribution after Water Flooding and Polymer Flooding

Author: Fengpeng Lai
Man Teng
Xiaodong Wu
Yining Wang
Zhaopeng Yang
Publication date: 01/01/2013

Abstract: This study examines a reservoir with a single sand body and vertical segmentation, simulating each sand layer individually using FCM and Petrel software. The black oil simulator E100, the Cartesian coordinate system, the angular point grid and the full implicit solution were used in historical fitting. And plane by 50 for step, the plane was divided into six grids and was vertically divided into six simulation layers. After grid coarsening, the total is 181170. For the oil reservoir block, the fitting error of the cumulative oil production history is 8.65% and the fitting error of the moisture content is 3.42%. For single-well oil production, the mean error is 7.36%, and the mean fitting error of the moisture content is 4.37%. After water flooding, the residual oil remained on top of the thick oil reservoir channel sand and accounted for 50.63% of the total surplus geological reserves. Thus, water flooding can improve oil recovery in highly permeable zones. After polymer flooding in a thick reservoir with a top layer of channel sand, the residual oil was 39.26%, which is 11.37 percentage points lower than that after water flooding.

CiteSeerX

A HINT from Arithmetic: On Systematic Generalization of Perception, Syntax, and Semantics

Author: Hong Yining
Huang Siyuan
Li Qing
Wu Ying Nian
Zhu Song-Chun
Zhu Yixin
Publication date: 01/03/2021

Inspired by humans' remarkable ability to master arithmetic and generalize to unseen problems, we present a new dataset, HINT, to study machines' capability of learning generalizable concepts at three different levels: perception, syntax, and semantics. In particular, concepts in HINT, including both digits and operators, must be learned in a weakly-supervised fashion: only the final results of handwritten expressions are provided as supervision. Learning agents need to reckon how concepts are perceived from raw signals such as images (i.e., perception), how multiple concepts are structurally combined to form a valid expression (i.e., syntax), and how concepts are realized to afford various reasoning tasks (i.e., semantics). With a focus on systematic generalization, we carefully design a five-fold test set to evaluate both the interpolation and the extrapolation of learned concepts. To tackle this challenging problem, we propose a neural-symbolic system by integrating neural networks with grammar parsing and program synthesis, learned by a novel deduction-abduction strategy. In experiments, the proposed neural-symbolic system demonstrates strong generalization capability and significantly outperforms end-to-end neural methods like RNN and Transformer. The results also indicate the significance of recursive priors for extrapolation on syntax and semantics. Comment: Preliminary wor

arXiv.org e-Print Archive